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Competing endogenous RNA (ceRNA) interactions form a multilayered network that regulates gene 
expression in various biological pathways. Recent studies have demonstrated novel roles of ceRNA 
interactions in tumorigenesis, but the dynamics of the ceRNA network in cancer remain unexplored. Here, 
we examine ceRNA network dynamics in prostate cancer from the perspective of alternative cleavage and 
polyadenylation (APA) and reveal the principles of such changes. Analysis of exon array data revealed that 
both shortened and lengthened 3'UTRs are abundant. Consensus clustering with APA data stratified 
cancers into groups with differing risks of biochemical relapse and revealed that a ceRNA subnetwork 
enriched with cancer genes was specifically dysregulated in high-risk cancers. The novel connection between 
3'UTR shortening and ceRNA network dysregulation was supported by the unusually high number of 
microRNA response elements (MREs) shared by the dysregulated ceRNA interactions and the significantly 
altered 3'UTRs. The dysregulation followed a fundamental principle in that ceRNA interactions connecting 
genes that show opposite trends in expression change are preferentially dysregulated. This targeted 
dysregulation is responsible for the majority of the observed expression changes in genes with significant 
ceRNA dysregulation and represents a novel mechanism underlying aberrant oncogenic expression. 

MicroRNAs are approximately 22 -nucleotide RNAs that regulate gene expression through complement- 
ary binding with microRNA response elements (MREs) in target transcripts 1 . Although positive influ- 
ences on target transcripts have been reported 2 , microRNAs are largely negative regulators of protein 
production whose effects predominantly involve the destabilization of target mRNAs 3 . The coherent role of 
microRNAs in mRNA regulation and their many- to-many interaction paradigm allow microRNAs to function as 
common resources for which different transcripts compete 4 ' 5 . Such interactions have been termed the ceRNA 
network and represent a novel form of regulation. Multiple studies have demonstrated that ceRNA networks 
regulate essentially all known biological processes, and their dysregulation could represent novel disease 
mechanisms 6 " 8 . 

Although coding regions and 5'UTRs have been reported to harbor MREs 910 , the vast majority of microRNA - 
mRNA interactions are mediated by 3'UTRs 1 . Polymorphism in 3'UTR regions is pervasive throughout the 
human genome, as about half of all known genes are estimated to undergo polyadenylation (APA) 11 . Because long 
3'UTRs tend to harbor more MREs, changing the sizes of 3'UTRs could influence key biological processes by 
strengthening or weakening the repressive effects of microRNAs. Sandberg et al. first described the profound role 
of 3'UTR APA dynamics in T cell proliferation 12 . Through clever probe-level analysis of microarray data, they 
showed that proliferating T cells utilized shorter 3'UTRs compared to their resting counterparts. The shortening 
of 3'UTRs enables key genes to escape microRNA repression, thus leading to higher expression and promoting 
proliferation. Additionally, 3'UTR shortening has been established as a key mechanism of oncogene activation 
and has demonstrated promising potential as a prognostic marker 13 " 16 . In addition to 3'UTR shortening, neural- 
specific lengthening of 3'UTRs has been reported during Drosophila development and in the mammalian 
brain 1718 , suggesting that there is considerable versatility in 3'UTR APA dynamics in diverse biological processes. 

Because 3'UTRs are essential building blocks of the ceRNA network (together with microRNAs) 5 , we hypothe- 
sized that the dynamics of 3'UTR APAs modulate the structure and strength of the relevant ceRNA networks. We 
applied the Bayesian change point (BCP) approach 19 ' 20 to analyze tandem 3'UTR APA dynamics using a large 
prostate cancer dataset 21 . Our results demonstrated that prostate cancers can be stratified into subsets with 
differing risks of biochemical relapse according to 3'UTR APA dynamics. A densely connected sub-network 
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enriched with prostate cancer genes was specifically dysregulated in 
high-risk prostate cancers. The dysregulation is not random; rather, 
it preferentially targets genes showing opposite expression changes, 
effectively shifting the total ceRNA balance and driving gene up-/ 
downregulation. These results demonstrated that in addition to 
direct effects on mRNAs, 3'UTR APA dynamics can exert profound 
influences on ceRNA networks; furthermore, the results, revealed a 
fundamental rule governing the dysregulation of the ceRNA network 
in cancer. 

Results 

Bayesian change point analysis of exon array data reveals complex 
3'UTR APA dynamics in prostate cancer. We first established the 
3'UTR APA landscape across a large cohort of prostate cancers 
consisting of 185 exon arrays from normal samples, primary and 
metastatic cancers and cell lines 21 . We focused on tandem 3'UTRs 
because multiple studies have supported their roles in tumorigene- 
sis 13,15 ' 22 , and several studies have successfully analyzed tandem 
3'UTR shortening with exon array data 12,23 . The exon array probes 
were first mapped to hgl9 using methods similar to PLATA 12 and 
Rmodel 23 (Fig. la). Instead of applying a modified f-test to individual 
samples in a manner similar to those used in previous studies, we 
adopted a multivariate Bayesian approach such that for each tandem 
3'UTR, all samples were considered in the same calculation (see 
Methods). Furthermore, the test estimated the probes' posterior 
probabilities as the change point and their posterior means in the 
same Bayesian procedure, thus substantially reducing the number of 



statistical tests (Fig. lb). The estimated 3'UTR shortening/leng- 
thening values were then filtered to retain only those with signifi- 
cant change probabilities and median absolute deviations (MADs), 
resulting in 279 tandem 3'UTRs. 

The shortening of 3'UTRs has been predominantly associated 
with elevated gene expression 1213 . However, a recent study has shown 
that 3'UTR sizes have limited effects on mRNA stabilities 24 . Taking 
advantage of the large dataset, we demonstrated that the correlation 
coefficient between 3'UTR shortening and gene expression clearly 
followed a bimodal distribution (Fig. lc). As expected, the majority of 
the genes demonstrated a strong positive correlation. However, a 
subset of 51 genes displayed a significant negative correlation, indi- 
cating that longer 3'UTRs could be associated with higher gene 
expression (we termed those the "negative gene set"). Those genes 
included the known oncogene RUNX1 (Supplementary Fig. 1) and 
prostate cancer- associated genes KLK2 and KLK3. Although func- 
tional enrichment analysis did not reveal significant results, most of 
those tandem 3'UTRs (42 out of 51) did fall within the same cluster 
when hierarchical clustering was performed using the 3'UTR APA 
dynamics data (Fig. 2c), suggesting that their 3'UTR lengths are 
regulated in concert. We further explored possible mechanisms link- 
ing longer 3'UTRs with increased gene expression via motif enrich- 
ment analysis. Several hexamers representing C-rich elements were 
significantly enriched in sequence regions that follow the identified 
APA sites in the negative set genes (Supplementary Table 1). C-rich 
elements are recognized by poly(C) -binding proteins and have been 
predominantly designed as stabilizers of mRNAs 25 . Thus, this bimo- 



Scale 

chr2: 145,141,5001 



2kb |- 



145,142,5001 



145,143,5001 



145,144,5001 



1 hg19 

145,145,5001 



145,146,5001 



145,147,5001 



probel 130188 1 
probe2832177l 
probe36410 I 
probe2973171 I 
probel 569966 1 



probe5651088l 



probe551 961 9 1 probe21 63851 1 
probe205073 I probe2786485 1 
probe5360908 1 

probe6434352 1 



probel 894558 1 
probe3119285l 
probe3869132l 
probe671909 I 



probe6321 590 1 probe386581 8 
probe2410230l 
probe6489613l 
probe82143 I 



Basic Gene Annotation Set from ENCODE/GENCODE Version 19 



ZEB2 
ZEB2 
ZEB2 



Posterior means and probabilities of a change 




c 

CD 

Q o 



-Q 
CO 



O 



O 
O 




10 
Location 



Correlation coefficient 
between mRNA expression and 3'UTR shortening 



Figure 1 | Bayesian change point approach for APA analysis with exon array data, (a) Mapping of exon array probes to ZEB2 3'UTR. (b) BCP 
analysis results. The upper panel shows the input probe intensities (dots) and posterior probe mean intensities (solid lines) for all samples. The lower 
panel shows the posterior change point probabilities for the probes, (c) The correlation coefficient between gene expression and 3'UTR shortening 
follows a bimodal distribution. The histogram represents the distribution of the correlation coefficient (blue). The estimated densities are shown for 
individual (magenta) and combined (turquoise) distributions. 
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Figure 2 | APA dynamics defines stable clusters with differing risks of biochemical relapse, (a) Consensus clustering matrix of prostate cancer samples 
for k = 2 to k = 5. (b) Consensus clustering CDF for k = 2 to k = 5. (c) Heatmap of the clustering result, (d) Survival analysis using classifications 
generated from consensus clustering. Cluster 1 displays a significantly higher probability of relapse. 



dal pattern indicated that mRNA stabilities are context dependent, 
and both microRNAs and RNA binding proteins could be the dom- 
inant factors. 

Consensus clustering of 3'UTR changes defines distinct prostate 
cancer subtypes with differing risks of relapse. We then 
investigated whether 3'UTR shortening data can produce 
biologically relevant stratification of prostate cancers. Consensus 
clustering generated four stable clusters (Fig. 2a, 2b). There are 
strong differences in biochemical relapse probabilities among the 
four subtypes (Fig. 2d). For the cluster with highest relapse 
probability, the samples were mostly metastatic tumors. The high- 
risk cluster also displayed the largest absolute change in 3'UTR 
shortening/lengthening (Fig. 2c). Prostate cancer cell lines also 
appeared in the same cluster. Moreover, all high-risk samples are 
from the two copy number variation (CNV) clusters with the 
largest CNV changes that have been reported previously 21 , thus 
confirming their status as advanced tumors. For the majority of 
the genes (including known oncogenes such as MYC, RUNX1, 
etc.), the high-risk cluster samples showed the highest expression 
of those genes (having shorter 3'UTRs, or longer 3'UTRs for the 
negative set genes). However, a subset of genes showed the 



opposite trend (Fig. 2c): high-risk cluster samples and metastatic 
samples displayed lower expression (longer 3'UTRs). This subset 
contained several EMT-related genes such as PDGFC, TGFBR1 
and ZEB2. Because ZEB2 has been designated as a key regulator of 
EMT via the antagonizing microRNA 200 family members 26 , we 
further investigated the 3'UTR lengthening of ZEB2 in metastatic 
samples and cell lines (Supplementary Fig. Id). We also analyzed the 
expression of PDGFD, a known growth factor expressed in normal 
prostate tissues that could upregulate ZEB2 27 . Those two genes 
displayed a significant positive correlation, and both genes were 
expressed at a lower level in cell lines and metastatic tumors 
(Supplementary Fig. le). This finding raised the interesting 
possibility that reduced expression of the PDGFD-ZEB2 axis may 
render cells more sensitive to EMT stimuli, thus leading to higher 
relapse probabilities. 

Targeted dysregulation of ceRNA networks in high-risk prostate 
cancers. We next examined the dysregulation of the ceRNA network 
in these subtypes via a mutual information- (MI-) based approach. A 
subnetwork of 5,185 significantly dysregulated ceRNA interactions 
was identified in the high-risk cluster (Fig. 3a, Table 1). The 
directions of the changes were homogeneous, in that removing 
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high-risk samples substantially increased Mis for all dysregulated 
gene pairs (Table 1), indicating that the ceRNA network was 
uniformly weakened in high-risk cancers. In contrast to the high- 
risk cluster, minimal numbers of significantly dysregulated ceRNA 
interactions were identified for the other three groups (Table 1). This 
observation suggested that substantial dysregulations of the ceRNA 
network are only evident in samples with large 3'UTR length dyna- 
mics, supporting our initial hypothesis. In addition, the medium-risk 
group was dominated by genes utilizing longer 3'UTRs (Fig. 2c, 
cluster 4) and was the only group with significantly strengthened 
ceRNA interactions (Table 1). This pattern further supported the 
idea that shortening 3'UTRs would release micro RN As and 
weaken the relevant ceRNA interactions. 

The dysregulated network in high-risk prostate cancers defined 
182 genes with unusually high numbers of dysregulated ceRNA 
interactions (Fig. 3a). These included well-known cancer genes such 
as PTEN, AKT3 and CDC25A. We first looked for pathways 
enriched in these 182 dysregulated genes. DAVID 28 analysis iden- 
tified 21 pathways (q- value < 0.05, Supplementary Table 2). Among 
the top identified were key pathways involved in tumorigenesis, 
including those in prostate cancer. We next analyzed the number 
of prostate cancer records in PubMed that were associated with these 
182 genes. Compared to simulated random gene sets, the set of 182 



genes returned on average 20.47 PubMed records, which was signifi- 
cantly higher than the random gene sets (p- value = 4.19E-11). These 
results confirmed that the dysregulated ceRNA network in high-risk 
prostate cancers specifically targeted cancer-related genes and 
pathways. 

Dysregulation of the ceRNA network is associated with 3'UTR 
APA dynamics. A straightforward explanation for the observed dys- 
regulation is that the mediating microRNAs for those interactions 
are upregulated. To test this possibility, we first compared the 
microRNA expression in samples strongly displaying ceRNA net- 
work dysregulation (cluster 1) with those showing few ceRNA 
network changes (clusters 2, 3 and 4) (Supplementary Fig. 2a). 
Significant Analysis of Microarray 29 (SAM) identified 29 different- 
ially expressed microRNAs (q- value < 0.01). Among those micro- 
RNAs, only 6 were over- expressed in dysregulated samples. This 
result indicated that in this particular dataset, differentially 
expressed microRNAs are not likely to dysregulate the ceRNA 
network because a reduced (but not depleted) level of microRNAs 
helps to strengthen ceRNA interactions. Additionally, we compared 
the set of differentially expressed microRNAs with the microRNAs 
that had been predicted to mediate the dysregulated ceRNA 
interactions (Supplementary Fig. 2b). Only approximately 2.3% of 
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Figure 3 | Targeted dysregulation of the ceRNA network, (a) The dysregulated ceRNA network in high-risk prostate cancers with 5,185 dysregulated 
ceRNA interactions. Node colors represent expression changes, and edge colors represent the significance of AMI. The node sizes of the 182 significantly 
dysregulated genes are proportional to the number of dysregulated ceRNA interactions. The node sizes for the non- significant genes are set to a small 
value to allow visual separation between significant and non- significant genes, (b, c and d) Enrichment of genes displaying opposite expression 
change in dysregulated ceRNA interactions for PTEN (b), CDC42 (c) and AKT3 (d). Only genes with significant expression change between high-risk and 
low-risk cancer samples were considered (SAM q-value < 0.05). Numbers under arrows indicate the counts for upregulated (upward facing arrows, 
magenta) and downregulated (downward facing arrows, turquoise) genes, (e) Expression changes of dysregulated genes display a strong correlation with 
the enrichment of genes with opposite directions of expression change in their dysregulated ceRNA interactions. The size of each dysregulated gene is 
proportional to its -log 10( enrichment p-value). The dotted line represents the cutoff for significant enrichment (p-value = 0.01). 
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the dysregulated ceRNA interactions had more than 20% overlap 
with the differentially expressed micro RNAs. This small overlap 
further supported the conclusion that differentially expressed 
microRNAs are not responsible for the observed ceRNA network 
dysregulation. 

We next examined whether the 279 3'UTRs with significant APA 
dynamics shared any microRNAs with the set of dysregulated 
ceRNA interactions. The MREs in the extended 3'UTR regions of 
the 279 genes were collected from computational predictions 
(TargetScan 30 , PITA 31 and miRanda 32 ). Overlaps between this set 
of MREs and the MREs predicted to mediate dysregulated ceRNA 
interactions were computed (Supplementary Fig. 2c, 2d). Compared 
to randomly selected ceRNA network interactions, the disrupted 
interactions in high-risk prostate cancers displayed a significantly 
higher number of total overlaps, as well as a strong bias toward a 
larger number of overlapping MREs, further strengthening the idea 
that 3'UTR dynamics could mediate the dysregulation of the ceRNA 
network. 

Dysregulated ceRNA interactions preferentially target genes with 
opposite directions of expression change. Because the core function 
of the ceRNA network is to regulate gene expression, we next 
investigated the principles governing the observed dysregulations. 
We first applied SAM to identify genes that were differentially 
expressed between high-risk and low-risk cancers (q-value < 
0.05). We then examined the PTEN ceRNA network in detail 
because it is one of the most well-studied cancer genes. As 
expected, PTEN was slightly downregulated in high-risk prostate 
cancers (fold change = —0.40, SAM q-value = 0.12), which is 
consistent with its known haploinsufficiency 33 . We then compared 
differentially expressed genes that are directly connected with PTEN 
in dysregulated and unchanged ceRNA interactions. While there are 
larger numbers of downregulated genes in the unchanged ceRNA 
interactions, there are twice as many genes upregulated in the 
dysregulated interactions (Fig. 3b). Although the difference is not 
significant (p-value = 0.24), the trend indicates that dysregulated 
ceRNA interactions preferentially interrupt the regulatory relation- 
ship between PTEN (downregulated) and upregulated genes. The 
trends are much more obvious for CDC42 (Fig. 3c) and AKT3 
(Fig. 3d). Both demonstrated significant enrichment of genes 
having opposite trends in expression change in dysregulated 
ceRNA interactions. To gain a systems -level insight, we analyzed 
all 182 significantly dysregulated genes for such enrichment 
(Fig. 3e). For 145 of the 182 dysregulated genes, significant (p- 
value < 0.05) consistent enrichments were observed: if the gene is 
up/downregulated, the dysregulated ceRNA interactions are en- 
riched for down/up regulated genes. Re- sampling analysis indicated 
that over 94% of SAM identified genes were consistently observed, 
confirmed that the observed dysregulation principle is not due to 
variations in differential expression detection. Twenty- six genes 
showed a consistent (albeit not significant) trend similar to that of 
PTEN. Only 11 of the 182 dysregulated genes exhibited the opposite 
trend (four were significant) of dysregulated ceRNA interactions 



enriched for genes with the same trend in expression change as the 
dysregulated gene. However, none of the 11 dysregulated genes 
displayed a significant change in expression (smallest SAM q-value 
= 0.17), and the magnitudes of fold change were all close to zero, 
suggesting that those anomalies are not representative. 

We next investigated whether the enrichment of genes with 
opposite expression change could contribute to the observed 
expression change of the 182 significantly dysregulated genes. We 
first examined the contribution by driving CNV to the observed 
expression changes. Although there was a significant positive cor- 
relation between CNV and expression change for the 182 dysregu- 
lated genes, the correlation coefficient was smaller and less 
significant than that of ceRNA dysregulation enrichment (0.49 vs. 
— 0.64, Supplementary Fig. 3). Moreover, multivariate linear regres- 
sion analysis demonstrated that more than 60% of the explained 
variance was due to ceRNA dysregulation enrichment (Supplement- 
ary Fig. 4). This result clearly showed that for the 182 genes with 
significant ceRNA dysregulation, their expression changes were 
mostly due to preferential disruption of ceRNA interactions connect- 
ing genes with opposite trends in expression changes. Overall, 84 of 
the 182 dysregulated genes displayed significant expression changes 
(SAM q-value < 0.05). The list can be further expanded with hap- 
loinsufficient genes such as PTEN, indicating that ceRNA dysregula- 
tion could have a significant impact on aberrant gene expression in 
high-risk prostate cancers. 

Discussion 

In contrast to the well-studied signal transduction and transcription 
regulatory networks, the properties and dynamics of the emerging 
ceRNA network remain elusive. Recent studies have shown that 
ceRNA network interactions are strongest when the interaction part- 
ners are expressed near threshold level 34 , indicating that the under- 
lying ceRNA network could be considerably dynamic. From the 
perspective of the ceRNA network, gene expression levels reflect 
the dynamic balance among the many ceRNA regulatory partners 4,34 . 
In contrast to single gene-based disruptions such as upregulation of 
TGFBR3 by HMGA2 35 , we showed, from a systems perspective, that 
ceRNA-based aberrant oncogenic expression can also be achieved by 
specifically removing genes with opposite trends in expression from 
the ceRNA equation. Despite the presence of thousands of dysregu- 
lated ceRNA interactions, we demonstrated that simple rules govern 
such changes. These findings not only substantially enhance our 
understanding of ceRNA networks but also could serve as inspira- 
tions to reveal the rules governing the dynamics of other types of 
biological networks. 

Because 3'UTRs are the main entities containing MREs and are 
thus the essential building blocks of the ceRNA network, the theory 
that changes in 3'UTRs could influence the underlying ceRNA net- 
work is logically grounded. However, given the enormous diversity in 
3'UTRs and the large number of participating microRNAs, global 
changes in 3'UTRs must be non-random to induce dysregulation of 
the ceRNA network. Our analysis of 3'UTR shortening patterns as 
well as the unusually high number of shared MREs between altered 
3'UTRs and dysregulated ceRNA interactions support this hypo- 
thesis. Although our analysis was constrained to only a subset of 
all possible 3'UTRs due to the limited genomic coverage of exon 
array probes, the presence of dysregulated ceRNA networks in 
high-risk prostate cancers indicates that 3'UTR dynamics directly 
influence the ceRNA network by modulating the pool of microRNAs 
targeting dysregulated ceRNA interactions. The novel connection 
between alternative splicing and ceRNA network dysregulation fur- 
ther highlights the advantages of approaching tumorigenesis from a 
systems perspective. 

A key observation from our findings with important clinical impli- 
cations is that significant ceRNA dysregulation is evident only in 
high-risk prostate cancers; this finding may indicate that remarkable 
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dynamic robustness exists in the ceRNA network. Such robustness is 
partly due to the scale-free topology 36 similar to that in many other 
biological networks, and it could be further enhanced by the intrinsic 
redundancy among ceRNA interactions. This observation suggests 
that constraining ceRNA may be one of the more difficult barriers to 
overcome during tumorigenesis, and as a late-stage event, its dysre- 
gulation may serve as a novel biomarker for prognosis. 

Methods 

Generation of tandem 3'UTR dataset. To identify the set of transcripts with tandem 
3'UTRs that could be analyzed with the Affymetrix Human Exon 1.0 ST Array, we 
first queried the Ensembl database with the following filters: Transcript count > = 2 
and with Affymetrix Microarray huex 1.0 st v2 probeset IDs. We then used the UCSC 
table browser to retrieve the 3'UTRs of the returned Ensembl Transcript IDs. For 
each gene, the genomic coordinates of its 3'UTRs were compared, and tandem 
3'UTRs were identified (3'UTRs with the same start position and different APA 
sites). 

We used a similar approach, as described previously, to identify probes that could 
measure 3'UTR APA dynamics 12 ' 23 . Briefly, the probes of the Affymetrix Human 
Exon 1.0 ST Array were mapped to the hgl9 genome. Probes mapped to multiple 
genomic locations were discarded. The number of uniquely mapped probes in each 
tandem 3'UTR was examined. The set of tandem 3'UTRs with at least two probes 
before and after APAs was kept for 3'UTR expression analysis, resulting in 7,059 
tandem 3'UTRs. 

Microarray Data. The raw Affymetrix Human Exon 1.0 ST Array data files were 
obtained from the Gene Expression Omnibus (GEO) accession GSE21034. The 
processed mRNA and microRNA expression data from the same study were 
downloaded from the MSKCC Prostate Cancer Genomics Data Portal (http://cbio. 
mskcc.org/prostate-portal/, Date of access: 10/10/2011). Gene level CNV data were 
obtained from cBioPortal 37 . 

Exon array data analysis. The CEL files were processed with aroma.affymetrix 38 
using RMA background correction and quantile normalization. The processed probe 
intensities were extracted from the intermediate CEL files to allow probe-level 
analysis. The extracted probe intensities were first log2 transformed. We then applied 
a similar approach to that outlined in PLATA 12 to normalize the probe-level data. 
Briefly, the probe intensities were first normalized to the median intensity of all 
probes mapped to the transcript. The gene-level normalized probe intensities were 
then mean centered to remove probe- specific effects. 

Calculation of tandem 3'UTR APA expression changes. The cancer sample probe 
intensities were first normalized to the median of the normal sample probe intensities 
to create relative fold changes. We then modeled the tandem 3'UTR APA dynamics 
problem as a change point problem and utilized the R package BCP 20 to perform 
Bayesian analysis of the change points. Briefly, the approach treated all samples for a 
tandem 3'UTR as a multivariate series with a common change point, and an n 
(number of probes for the 3'UTR) by m (number of samples) matrix was provided as 
algorithm input. The position (probe) with the highest posterior change probability 
was chosen as the change point. The probes were then partitioned into common and 
extended groups using the identified change point. For each group, the median of the 
posterior means of all probes was calculated, and the fold change in the expression 
between the common and extended regions was calculated as the difference between 
the two medians. Positive numbers indicated higher expression of common regions 
(shortening), and negative numbers indicted higher expression of extended regions 
(lengthening). 

Tandem UTR expression data filtering. Several filtering steps were applied to 
remove unreliably measured tandem 3'UTR expression changes. We first selected 
tandem 3'UTRs with significant change points. The largest change point posterior 
probabilities for all tested tandem 3'UTRs were collected. Assuming that the majority 
of the detected changes were not significant, we estimated the mean and standard 
deviation by fitting the data to a normal distribution. The z- score for each change 
point according to the estimated normal distribution was then calculated. We then 
applied a cutoff of z-score > 2 to select tandem 3'UTRs harboring significant 
changes. Similarly, tandem 3'UTRs with a MAD z-score < 2 were excluded from 
further analysis. We then examined the selected tandem 3'UTRs for multiple 
significant change points. If such change points in a tandem 3'UTR were 
discontinuous, the tandem 3'UTR was excluded from further analysis because the 
discontinuity represented complex splicing patterns and did not reflect simple 3 'UTR 
shortening or lengthening. 

Clustering analysis. We utilized the Bioconductor package ConsensusClusterPlus 39 
to identify robust clusters. Hierarchical clustering was performed with Ward's 
minimum variance method and Euclidean distance. The procedure was run using the 
279 filtered tandem 3'UTRs over 1000 iterations and with a sub-sampling ratio of 0.8. 

Dysregulation analysis of ceRNA networks. The ceRNA network developed by 
Sumazin et al. 6 was downloaded from the publication website. We used an approach 
similar to the IDEA algorithm 40 developed by Mani et al. to analyze the perturbation 



of the ceRNA network coupled with 3'UTR APA dynamics. Briefly, Mis between 
ceRNA gene pairs were first calculated using all prostate cancer samples (MI flH ). 
Samples for each subtype identified in the consensus clustering analysis were then 
removed, and the Mis were recalculated (Ml a u_ k ). A positive AMI (Ml a u_ k - Ml aU ) 
indicated that samples of subtype k weakened the underlying ceRNA interaction. We 
then calculated the significance of AMI using the permutation approach outlined in 
the IEAD method and applied a Bonferroni-corrected p- value of 0.01 to select 
significantly altered ceRNA interactions. We further filtered the identified 
interactions and retained only those with positive correlation because they 
represented modulators with a strong sponge effect. Finally, Fisher's exact test was 
performed to select genes with a significant number of dysregulated ceRNA 
interactions using a Bonferroni-corrected p-value of 0.01. The final dysregulated 
network was filtered to only contain those genes that were directly connected to the 
significant genes. 

Differential expression and driving CNV analysis. We adopted the re-sampling 
approach outlined in Multiple Survival Screening (MSS) 41 to identify the set of genes 
demonstrating robust differential expression. Briefly, the ratio of high-risk vs. low- 
risk cancer samples was maintained and 1000 rounds of re- sampling were performed 
with a sub-sampling ration of 0.8. For driving CNV analysis, we adopted the 
definition established by Zaman et al. 42 . Briefly, a driving CNV has a GISTIC score > 
0.3, plus among top (for amplification) or bottom (for deletion) 50% of the expressed 
genes. 

Statistical and microRNA analysis of dysregulated ceRNA networks. Network 
visualizations were produced with Cytoscape 43 . The pathway enrichment analysis was 
performed with the NCBI DAVID tool 28 . For PubMed analysis, PubMed records were 
retrieved using the keywords "prostate cancer" and the query gene's official gene 
symbol. For genes whose names matched regular English words (such as REST or 
SET), the full gene names were used. Significance was assessed with 10,000 
simulations using randomly selected genes. To calculate the significance of 
overlapping microRNAs between dysregulated ceRNA network interactions and the 
279 3'UTRs, a Chi-squared test was performed using the average overlapping data 
from 10,000 random ceRNA networks as a reference distribution. All statistical 
computations were performed in R. 
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